Description

[0001] The invention generally relates to data com- 
pression, and more specifically relates to a form of en- 
tropy coding. 

[0002] In a typical audio coding environment, data is 
represented as a long sequence of symbols which is in- 
put to an encoder. The input data is encoded by an en- 
coder, transmitted over a communication channel (or 
simply stored), and decoded by a decoder. During encoding,
the input is pre-processed, sampled, converted,
compressed or otherwise manipulated into a form for 
transmission or storage. After transmission or storage, 
the decoder attempts to reconstruct the original input. 
[0003] Audio coding techniques can be generally cat- 
egorized into two classes, namely the time-domain tech- 
niques and frequency-domain ones. Time-domain tech- 
niques, e.g., ADPCM, LPC, operate directly in the time 
domain while the frequency-domain techniques trans- 
form the audio signals into the frequency domain where 
the actual compression happens. The frequency-do- 
main codecs can be further separated into either sub- 
band or transform coders although the distinction be- 
tween the two is not always clear. Processing an audio 
signal in the frequency domain is motivated by both clas- 
sical signal processing theories and human perception 
models (e.g., psychoacoustics). The inner ear, specifically
the basilar membrane, behaves like a spectral analyzer
and transforms the audio signal into spectral data
before further neural processing proceeds.
[0004] Frequency-domain audio codecs often take advantage
of the many kinds of auditory masking that occur in the
human hearing system to modify the original signal and
eliminate a great many details and redundancies. Since the
human ear is not capable of perceiving these modifications,
efficient compression is achieved. Masking is usually
conducted in conjunction with quantization so that
quantization noise can be conveniently "masked." In modern
audio coding techniques, the quantized spectral data are
usually further compressed by applying entropy coding,
e.g., Huffman coding.

[0005] Compression is required because a funda- 
mental limitation of the communication model is that 
transmission channels usually have limited capacity or 
bandwidth. Consequently, it is frequently necessary to 
reduce the information content of input data in order to 
allow it to be reliably transmitted, if at all, over the com- 
munication channel. Over time, tremendous effort has 
been invested in developing lossless and lossy com- 
pression techniques for reducing the size of data to 
transmit or store. One popular lossless technique is 
Huffman encoding, which is a particular form of entropy 
encoding. 

[0006] Entropy coding assigns code words to different
input sequences, and stores all input sequences in a
code book. The complexity of entropy encoding depends
on the number m of possible values an input sequence X
may take. For small m, there are few possible input
combinations, and therefore the code book for the
messages can be very small (e.g., only a few bits are
needed to unambiguously represent all possible input
sequences). For digital applications, the code alphabet
is most likely a series of binary digits {0, 1}, and code
word lengths are measured in bits.

[0007] If it is known that input is composed of symbols
having equal probability of occurring, an optimal encoding
is to use equal length code words. But it is not typical
that an input stream has equal probability of receiving
any particular message. In practice, certain messages
are more likely than others, and entropy encoders take
advantage of this to minimize the average length of code
words among expected inputs. Traditionally, however,
fixed length input sequences are assigned variable
length codes (or conversely, variable length sequences
are assigned fixed length codes).
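As an illustration of how an entropy encoder shortens the average code word when messages are not equally likely, the following minimal Python sketch computes Huffman code lengths for a four-message source. The probabilities and the helper name `huffman_code_lengths` are assumptions chosen for illustration and do not come from this specification.

```python
import heapq
from itertools import count

def huffman_code_lengths(probabilities):
    """Return {symbol: code length in bits} for a Huffman code."""
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(p, next(tie), {sym: 0}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)
        p2, _, right = heapq.heappop(heap)
        # Merging two subtrees puts every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**left, **right}.items()}
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

# Hypothetical source: four messages with unequal probabilities.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = huffman_code_lengths(probs)
average = sum(probs[s] * lengths[s] for s in probs)
print(lengths, average)
```

For this assumed source the average comes to 1.75 bits per message, whereas equal length code words for four messages would need 2 bits each.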
[0008] International Publication No. WO 98/40969 describes
a system for compressing an ASCII or similarly
encoded text file. Pipelining of certain data compression
algorithms is described in Bailey and Mukkamala,
"Pipelining Data Compression Algorithms", The Computer
Journal, vol. 33 no. 4 (August 1990). An adaptive
algorithm for lossless compression of digital audio is
described in Shamoon and Heegard, "A Rapidly Adaptive
Lossless Compression Algorithm for High Fidelity Audio
Coding", Proc. IEEE Data Compression Conf., 430-39
(1994). A comparison of the H.261 and MPEG1 video
compression standards is described in von Roden,
"H.261 and MPEG1 - A Comparison", Proc. IEEE 15th Ann.
Int'l Conf. on Computers and Comm., 65-71 (1996).
[0009] The invention is defined by the subject matters 
of the independent claims. 

[0010] Preferred embodiments are defined by the dependent
claims.

[0011] The invention concerns using a variable-to-variable
entropy encoder to code an arbitrary input
stream. A variable-to-variable entropy encoder codes
variable length input sequences with variable length
codes. To limit code book size, entropy-type codes may
be assigned to only probable inputs, and alternate
codes used to identify less probable sequences.
[0012] To optimize searching the code book, it may
be organized into sections that are searched separately.
For example, one arrangement is to group all stored input
sequences in the book according to the first symbol
of the input sequence. A hash encoding function, collection
of pointers, or other method may be used to immediately
jump to a given section of the code book.
Each section may further be sorted according to the
probability associated with the entry. For example, each
section may be sorted with the most probable inputs located
first in the section, thus increasing the likelihood
that a match will be found quickly.
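The sectioned arrangement described above can be sketched as follows. This is a minimal Python illustration, assuming a code book held as a list of (input sequence, probability, code word) tuples; the entries, the tuple layout, and the name `build_sections` are hypothetical and are not taken from the illustrated embodiments.

```python
from collections import defaultdict

def build_sections(entries):
    """Group (sequence, probability, code) entries by first symbol.

    Each section is sorted with the most probable entry first.
    """
    sections = defaultdict(list)
    for sequence, probability, code in entries:
        sections[sequence[0]].append((sequence, probability, code))
    for section in sections.values():
        section.sort(key=lambda entry: entry[1], reverse=True)
    return dict(sections)

# Hypothetical code book entries: (input sequence, probability, code word).
entries = [
    ("AA", 0.20, "110"),
    ("AB", 0.45, "0"),
    ("B", 0.25, "10"),
    ("C", 0.10, "111"),
]
sections = build_sections(entries)
print([seq for seq, _, _ in sections["A"]])
```

Here a dictionary keyed on the first symbol stands in for the hash function or collection of pointers, so a lookup jumps directly to one section and tests its most probable entries first.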

[0013] Matching code book entries depends on the internal
representation of the book. For example, in a tree
structure, nodes may represent each character of the
input such that reaching a leaf signifies the end and
identification of a particular grouping of input symbols.
In a table structure, a pattern matching algorithm can be
applied to each table entry within the appropriate section.
Depending on the implementation of the table and
matching algorithms, searching may be facilitated by
recognition that only as many input symbols as the longest
grouping in the code book section need to be considered.
After finding a code book match, the corresponding
entropy-type code can be output and the
search repeated with the next symbol following the
matched input.
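One way to picture the table-based matching loop is the following Python sketch. It assumes a code book stored as a plain dictionary from input sequences to code words and uses a greedy longest-match search; the code book contents and the function name `encode` are illustrative assumptions, and for brevity a single dictionary stands in for the per-section search.

```python
def encode(symbols, code_book):
    """Greedy longest-match encoding with a dict-based code book.

    code_book maps input sequences (strings) to code words (bit strings).
    """
    longest = max(len(sequence) for sequence in code_book)
    output = []
    position = 0
    while position < len(symbols):
        # Only as many symbols as the longest stored grouping are considered.
        for size in range(min(longest, len(symbols) - position), 0, -1):
            candidate = symbols[position:position + size]
            if candidate in code_book:
                output.append(code_book[candidate])
                position += size  # resume with the symbol after the match
                break
        else:
            raise ValueError(f"no code book entry at position {position}")
    return "".join(output)

# Hypothetical code book with a prefix-free set of code words.
code_book = {"AB": "0", "B": "10", "AA": "110", "C": "111"}
bits = encode("AABBC", code_book)
print(bits)
```

In this sketch the input "AABBC" is matched as "AA", "B", "B", "C", and an input position with no matching entry raises an error rather than being escaped.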

[0014] The illustrated embodiments focus on encoding
audio data, and the input stream is expected to be
any data stream, such as numbers, characters, or binary
data, that encodes audio. For simplicity, the input
stream is referenced herein as a series of symbols,
where each "symbol" refers to the appropriate measurement
unit for the particular input. The input stream may
originate from local storage, or from intranets, the
Internet, or streaming data (e.g., Microsoft's "NETSHOW"
client/server streaming architecture).

FIG. 1 is a block diagram of a computer system that may be used to implement variable-to-variable entropy encoding.
FIG. 2 shows a basic communication model for transmitting streaming and non-streaming data.
FIG. 3 is a flowchart showing creation of a code book having variable length entries for variable length symbol groupings.
FIGS. 4-10 illustrate creation of a code book pursuant to FIG. 3 for an alphabet {A, B, C}.
FIG. 11 shows encoding of audio data.
FIG. 12 illustrates an entropy encoder.

[0015] The invention has been implemented in an audio/visual
codec (compressor/de-compressor). This is
only one example of how the invention may be imple- 
mented. The invention is designed to be utilized wher- 
ever entropy-type coding may be utilized, and is appli- 
cable to compression of any type of data. Briefly de- 
scribed, optimal entropy encoding requires excessive 
resources, and the illustrated embodiments provide a 
nearly optimal encoding solution requiring far fewer re- 
sources. 

Exemplary Operating Environment 

[0016] FIG. 1 and the following discussion are intend- 
ed to provide a brief, general description of a suitable 
computing environment in which the invention may be 
implemented. While the invention will be described in 
the general context of computer-executable instructions 
of a computer program that runs on a personal compu- 
ter, those skilled in the art will recognize that the inven- 
tion also may be implemented in combination with other 
program modules. Generally, program modules include
routines, programs, components, data structures, etc.
that perform particular tasks or implement particular ab- 
stract data types. Moreover, those skilled in the art will 
appreciate that the invention may be practiced with oth- 
er computer system configurations, including hand-held 
devices, multiprocessor systems, microprocessor- 
based or programmable consumer electronics, mini- 
computers, mainframe computers, and the like. The il- 
lustrated embodiment of the invention also is practiced 
in distributed computing environments where tasks are 
performed by remote processing devices that are linked 
through a communications network. But some embodiments
of the invention can be practiced on stand-alone
computers. In a distributed computing environment,
program modules may be located in both local and remote
memory storage devices. 

[0017] With reference to FIG. 1, an exemplary system
for implementing the invention includes a computer 20,
including a processing unit 21, a system memory 22,
and a system bus 23 that couples various system components
including the system memory to the processing
unit 21. The processing unit may be any of various
commercially available processors, including Intel x86,
Pentium and compatible microprocessors from Intel and
others, the Alpha processor by Digital, and the PowerPC
from IBM and Motorola. Dual microprocessors and other
multi-processor architectures also can be used as the
processing unit 21.

[0018] The system bus may be any of several types 
of bus structure including a memory bus or memory con- 
troller, a peripheral bus, and a local bus using any of a 
variety of conventional bus architectures such as PCI, 
AGP, VESA, MicroChannel, ISA and EISA, to name a 
few. The system memory includes read only memory 
(ROM) 24 and random access memory (RAM) 25. A ba- 
sic input/output system (BIOS), containing the basic 
routines that help to transfer information between ele- 
ments within the computer 20, such as during start-up, 
is stored in ROM 24. 

[0019] The computer 20 further includes a hard disk
drive 27, a magnetic disk drive 28, e.g., to read from or 
write to a removable disk 29, and an optical disk drive 
30, e.g., for reading a CD-ROM disk 31 or to read from 
or write to other optical media. The hard disk drive 27, 
magnetic disk drive 28, and optical disk drive 30 are con- 
nected to the system bus 23 by a hard disk drive inter- 
face 32, a magnetic disk drive interface 33, and an op- 
tical drive interface 34, respectively. The drives and their 
associated computer-readable media provide nonvola- 
tile storage of data, data structures, computer-executa- 
ble instructions, etc. for the computer 20. Although the 
description of computer-readable media above refers to 
a hard disk, a removable magnetic disk and a CD, it 
should be appreciated by those skilled in the art that oth- 
er types of media which are readable by a computer, 
such as magnetic cassettes, flash memory cards, digital 
video disks, Bernoulli cartridges, and the like, may also 
be used in the exemplary operating environment.
[0020] A number of program modules may be stored
in the drives and RAM 25, including an operating system 
35, one or more application programs (e.g., Internet 
browser software) 36, other program modules 37, and 
program data 38. 

[0021] A user may enter commands and information 
into the computer 20 through a keyboard 40 and pointing 
device, such as a mouse 42. Other input devices (not 
shown) may include a microphone, joystick, game pad, 
satellite dish, scanner, or the like. These and other input
devices are often connected to the processing unit 21 
through a serial port interface 46 that is coupled to the 
system bus, but may be connected by other interfaces, 
such as a parallel port, game port or a universal serial 
bus (USB). A monitor 47 or other type of display device 
is also connected to the system bus 23 via an interface, 
such as a video adapter 48. In addition to the monitor, 
personal computers typically include other peripheral 
output devices (not shown), such as speakers and print- 
ers. 

[0022] The computer 20 is expected to operate in a 
networked environment using logical connections to 
one or more remote computers, such as a remote com- 
puter 49. The remote computer 49 may be a web server, 
a router, a peer device or other common network node, 
and typically includes many or all of the elements de- 
scribed relative to the computer 20, although only a 
memory storage device 50 has been illustrated in FIG. 1.
The computer 20 can contact the remote computer
49 over an Internet connection established through a
gateway 55 (e.g., a router, dedicated-line, or other
network link), a modem 54 link, or by an intra-office local
area network (LAN) 51 or wide area network (WAN) 52. 
It will be appreciated that the network connections 
shown are exemplary and other means of establishing 
a communications link between the computers may be 
used. 

[0023] In accordance with the practices of persons 
skilled in the art of computer programming, the present 
invention is described below with reference to acts and 
symbolic representations of operations that are per- 
formed by the computer 20, unless indicated otherwise. 
Such acts and operations are sometimes referred to as 
being computer-executed. It will be appreciated that the 
acts and symbolically represented operations include 
the manipulation by the processing unit 21 of electrical 
signals representing data bits which causes a resulting 
transformation or reduction of the electrical signal rep- 
resentation, and the maintenance of data bits at memory 
locations in the memory system (including the system 
memory 22, hard drive 27, floppy disks 29, and 
CD-ROM 31) to thereby reconfigure or otherwise alter 
the computer system's operation, as well as other 
processing of signals. The memory locations where da- 
ta bits are maintained are physical locations that have 
particular electrical, magnetic, or optical properties cor- 
responding to the data bits. 

[0024] FIG. 2 shows a basic communication model.
In a basic communication model, there is a data source
or sender 200, a communication channel 204, and a data
receiver 208. The source may be someone speaking
on a telephone, over telephone wires, to another person.
Or, the source may be a television or radio broadcast
transmitted by wireless methods to television or radio
receivers. Or, the source may be a digital encoding
of audio data, transmitted over a wired or wireless
communication link (e.g., a LAN or the Internet) to a
corresponding decoder for the information.

[0025] To facilitate transmission and receipt of the data,
an encoder 202 is used to prepare the data source
for transmission over the communication channel 204.
The encoder is responsible for converting the source data
into a format appropriate for the channel 204. For example,
in the context of a common telephone call, one's
voice is typically converted by the phone's handset from
voice sounds to analog impulses that are sent as analog
data to local telephone receiving equipment. This analog
signal is then converted into digital form, multiplexed
with numerous other conversations similarly encoded,
and transmitted over a common line towards the receiver.
Thus, in FIG. 2, the channel 204 corresponds in large
part to a common pathway shared by multiple senders
and receivers. For network applications, the channel
204 is commonly an intranet or the Internet. At the
receiving end 208, a decoder 206 is required to reverse
the encoding process so as to present sensible data to
the receiver.

[0026] This simple model does not take into account,
however, the real-world demands of application programs.
For example, a client (e.g., an application program)
commonly wants to process, display or play received
data in real-time as it is retrieved over a network
link. To do so, a streaming delivery system is required,
i.e., an adaptive data transmission system that allows
application-level bandwidth reservation for a data stream.
Streaming environments contrast with traditional
networking programs, such as certain versions of Internet
browsers that download web page content on a non-prioritized
basis, and allow data content delivery over the
network link 204 to be orchestrated (and optimized) for
particular retrieval needs (such as a slow dial-up link).
[0027] An exemplary streaming format (SF) is the Microsoft
Active Streaming Format. Generally, a SF defines
the structure of complex, synchronized object data
streams. Any object can be placed into a SF data
stream, including audio and video data objects, scripts,
ActiveX controls, and HTML documents. SF data can
be stored in a file or created in real-time using audio and
video capture hardware. An Application Programming
Interface (API) corresponding to an implementation of
the SF can provide an application with support for
delivering and receiving streaming content. One such API
is the Microsoft Audio Compression Manager (ACM),
which provides functions for processing (e.g., compressing
and delivering) audio data. Other networking
APIs that can be used to support the SF include the
Microsoft Win32 Internet Extensions (WinInet), WinSock,
and TCP/IP APIs. (For more information see the 1998
Visual Studio 6.0 MSDN Library.) Note that it is intended
that processed data can be stored for later retrieval by
a client, and that such retrieval can be performed in a
non-streaming format (e.g., by a small playback appliance).

[0028] To transmit streaming or non-streaming data, 
networks such as the Internet convert the source data 
into packet form suitable for the network. Packets gen- 
erally include routing information as well as the actual 
data. SF data streams are preferably made up of data 
packets that can be transmitted over any digital network 
by inserting them one at a time into the data field of net- 
work packets. Each SF packet may contain a prioritized 
mix of data from different objects within the stream, so 
that the bandwidth can be concentrated on higher prior- 
ity objects (or organized to optimize throughput). This 
data can be captured in real time, stored to nonvolatile 
storage, converted from existing audio or video formats, 
created by combining audio with pictures and scripts, or 
delivered over the network to a client program or viewer. 
The client receiver 208 of the streaming data can be a 
traditional "helper" application (for compatibility with the 
old Web publishing approach), or a more modern web
page control (e.g., an ActiveX object) embedded in a 
web page. 

[0029] SF data streams are distinguished over tradi- 
tional network content as being viewed progressively in 
real time as a client receives it. Unfortunately, playback 
of streamed content becomes susceptible to transmis- 
sion delays. If data does not arrive reliably, or if trans- 
mission speed falls below an acceptable minimum, play- 
back of the content cannot continue at an acceptable 
rate. Smooth-streaming playback at a client requires 
that the transmission require a bandwidth less than the 
client's available bandwidth (e.g. the speed of the link 
204 less networking overhead). Typically a dial-up con- 
nection to the Internet provides a bandwidth of 28-34 
Kbps. Consequently, audiovisual source data (which is 
bandwidth intensive) must be significantly compressed 
to allow its transmission over low bit-rate connections. 
The degree of compression necessarily impacts the 
quality of the reproduced signal. Preferably a server pro- 
vides multiple sources optimized for different network- 
ing speeds, or utilizes an adaptive feedback system to 
perform real-time analysis of the client's actual through- 
put. 
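To make the scale of the required compression concrete, the following arithmetic sketch compares an uncompressed audio source with the lower dial-up figure quoted above. The source format (CD-quality stereo PCM at 44.1 kHz and 16 bits per sample) is an assumption used only for illustration; this specification does not name a source format, and networking overhead is ignored.

```python
# Assumed source format (not from the specification): CD-quality stereo PCM.
sample_rate_hz = 44_100
bits_per_sample = 16
channels = 2
source_bps = sample_rate_hz * bits_per_sample * channels  # 1,411,200 bit/s

link_bps = 28_000  # lower end of the 28-34 Kbps dial-up range
required_ratio = source_bps / link_bps
print(f"{required_ratio:.1f}:1")
```

Under these assumptions the source would have to be reduced by a factor of roughly fifty before it could stream smoothly over such a link.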

[0030] Once SF data packets are encoded 202 and 
placed inside network packets and sent over the net- 
work 204, the routing technology of the network takes 
care of delivering the network packets to the receiver 
208. Preferably a variety of network and application pro- 
tocols, such as UDP, TCP, RTP, IP Multicast, IPX, and 
HTTP, are supported by the broadcast sender 200. 
[0031] As discussed above, bandwidth is limited and
the encoder 202 generally must compress data prior to
transmission. A particularly effective method for encoding
source data frequency coefficients to ensure reliable
transmission over a communication channel is entropy
encoding. Entropy coding capitalizes on data coherency,
and is effective when symbols have non-uniform
probability distribution.
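The benefit of a non-uniform distribution can be quantified with the Shannon entropy, which gives a lower bound on the average number of bits per symbol for a memoryless source. The following Python sketch is illustrative only; the two distributions and the helper name `entropy_bits` are assumptions rather than values from this specification.

```python
from math import log2

def entropy_bits(probabilities):
    """Shannon entropy in bits per symbol; zero-probability terms add 0."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

# Hypothetical four-symbol sources.
uniform = entropy_bits([0.25, 0.25, 0.25, 0.25])
skewed = entropy_bits([0.5, 0.25, 0.125, 0.125])
print(uniform, skewed)
```

For these assumed sources the uniform distribution needs 2 bits per symbol, while the skewed one can in principle be coded at 1.75 bits per symbol.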

[0032] FIG. 3 is a flowchart showing a preferred method
for generating an entropy encoder's code book. In
particular, in contrast with prior art techniques, FIG. 3
illustrates how to create a code book having variable
length code assignments for variable length symbol
groupings. As discussed above, prior art techniques either
require fixed-length codes or fixed blocks of input.
Preferred implementations overcome the resource
requirements of large dimension vector encoding, and the
inapplicability of coding into words of equal lengths, by
providing an entropy based variable-to-variable code,
where variable length code words are used to encode
variable length X sequences.

[0033] Let yi represent each source symbol group {xj},
for 1 ≤ j ≤ Ni, having probability Pi of occurring within the
input stream (FIG. 2 channel 204), and suppose that each group
is assigned a corresponding code word having Li bits.
It is presumed that each xj is drawn from a fixed alphabet
of predetermined size. The objective is to minimize the
equation

$$L = \frac{\sum_i L_i P_i}{\sum_i N_i P_i}$$
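As a worked illustration of this objective, the sketch below evaluates L for a small code book, assuming L is the expected code word length in bits divided by the expected group length in symbols (that is, the average number of bits spent per source symbol). The three entries and the function name `bits_per_symbol` are hypothetical.

```python
def bits_per_symbol(code_book):
    """code_book: list of (group_length_N, probability_P, code_length_L)."""
    average_code_bits = sum(p * l for _, p, l in code_book)
    average_group_len = sum(p * n for n, p, _ in code_book)
    return average_code_bits / average_group_len

# Hypothetical code book: (N_i symbols, probability P_i, L_i bits).
example = [(2, 0.5, 1), (1, 0.25, 2), (3, 0.25, 2)]
rate = bits_per_symbol(example)
print(rate)
```

With these assumed values the numerator is 1.5 bits and the denominator is 2.0 symbols, giving 0.75 bits per source symbol; shorter codes lower the numerator and longer probable groups raise the denominator.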



[0034] Instead of finding a general solution to the
problem, the problem is separated into two different
tasks. The first task is identification of a (sub-optimal)
grouping of a set of input symbols {xi} through an
empirical approach described below. The second task is
assigning an entropy-type code for the grouped symbols
{yi}. Note that it is known that if the source is not coherent
(i.e., the input is independent or without memory), any
grouping that has the same configuration of {Nj} can
achieve the same coding efficiency. In this situation, the
first task becomes inconsequential.

[0035] To perform the first task, an initial trivial symbol
grouping 300 is prepared, such as {yi} = {xi}. This initial
configuration assumes that an exemplary input stream
is being used to train creation of the code book. It is
understood that a computer may be programmed with
software constructions such as data structures to track
receipt of each symbol from an input. Such data structures
may be implemented as a binary-type tree structure,
hash table, or some combination of the two. Other
equivalent structures may also be used.

[0036] After determining the trivial grouping, the
probability of occurrence for each yi is computed 302. Such
probability is determined with respect to any exemplary
input used to train code book generation. As further
symbols are added to the symbol data structure, the
probabilities are dynamically adjusted.
[0037] Next, the most probable grouping yi is identified
304 (denoted as ymp). If 306 the highest probability
symbol is a grouping of previously lower probability
symbols, then the grouping is split 308 into its constituent
symbols, and processing is restarted from step 302.
(Although symbols may be combined, the group retains
memory of all symbols therein so that symbols can be
extracted.)

[0038] If the symbol is not a grouping, then processing
continues with step 310, in which the most probable
grouping is tentatively extended with single symbol
extensions xi. Preferably ymp is extended with
each symbol from the X alphabet used. However, a
predictor can be used to generate an extension set
containing only probable extensions, if the alphabet is
very large and it is known that many extensions are unlikely.
For example, such a predictor may be based on semantic
or contextual meaning, so that very improbable
extensions can be ignored a priori.
[0039] The probability for each tentative expansion of
ymp is then computed 312, and only the most probable
extension is retained 314. The rest of the lower probability
extensions are collapsed together 316 as a combined
grouping and stored in the code book with a special symbol
to indicate a combined grouping. This wild-card symbol
represents any arbitrary symbol grouping having ymp
as a prefix, but with an extension (suffix) different from
the most probable extension. That is, if ymp + xmp is
the most probable root and extension, then the other
less probable extensions are represented as ymp *, where
* ≠ xmp. (Note that this discussion presumes, for clarity,
serial processing of single-symbol extensions; however,
parallel execution of multiple symbol extensions is
contemplated.) It is understood by one skilled in the art that
applying single symbol extensions, and keeping only
one most probable grouping, are restrictions imposed
for clarity of discussion. It is further understood that
although discussion focuses on serial processing, code
book construction may be parallelized.
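The retain-and-collapse step can be sketched in a few lines of Python. This is a simplified illustration, assuming the occurrence counts of the tentative single-symbol extensions have already been computed; the function name `retain_and_collapse`, the tuple layout of its result, and the example counts are assumptions made for the sketch.

```python
def retain_and_collapse(prefix, extension_counts):
    """Keep the most frequent single-symbol extension; pool the rest.

    Returns ((retained_sequence, count), (wildcard_members, combined_count)).
    Ties go to the extension listed first in extension_counts.
    """
    best = max(extension_counts, key=extension_counts.get)
    others = tuple(s for s in extension_counts if s != best)
    pooled = sum(extension_counts[s] for s in others)
    return (prefix + best, extension_counts[best]), (others, pooled)

# Hypothetical counts for the tentative extensions of the prefix "A".
retained, wildcard = retain_and_collapse("A", {"A": 2, "B": 4, "C": 0})
print(retained, wildcard)
```

In this sketch the wild-card entry simply records which extension symbols it stands for, so that the group can later be split back into its constituent symbols.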
[0040] Code book construction is completed by re- 
peating 318 steps 302-316 until all extensions have 
been made, or the number of the code book entries 
reaches a predetermined limit. That is, repeating com- 
puting probabilities for each current yi 302, where the 
code book set {Y} now includes ymp + xmp, and respec- 
tively choosing 304 and grouping the most and least 
likely extensions. The effect of repeatedly applying the 
above operations is to automatically collect symbol 
groupings having high correlation, so that inter-group 
correlation is minimized. This minimizes the numerator 
of L, while simultaneously maximizing the length of the 
most probable yi so that the denominator of L is maxi- 
mized. 

[0041] FIGS. 4-10 illustrate creation of a code book
pursuant to FIG. 3 for an alphabet {A, B, C}. For this
discussion, the code book is defined with respect to an
exemplary input stream "A A A B B A A C A B A B B A B".
As discussed above, one or more exemplary inputs
may be used to generate a code book that is then used
by encoders and decoders to process arbitrary inputs.
For clarity, the code book is presented as a tree structure,
although it may in fact be implemented as a linear
table, hash table, database, etc. As illustrated, the tree
is oriented left-to-right, where the left column (e.g., "A"
and "X0") represents the top row of a tree-type structure,
and successively indented rows represent the "children"
of the previous row's node (e.g., in a top-down tree for
FIG. 5, node "A" is a first-row parent node for a
second-row middle-child node "B").

[0042] In preparing the code book, the general rule is
to pick the most probable leaf node, expand it, re-compute
probabilities to determine the most probable leaf node,
and then compact the remaining sibling nodes into
a single Xn node (n = 0..N, tracking each time nodes
have been combined). If it turns out that the most probable
node is a group node, then the group is split,
probabilities recalculated, and the most probable member
node retained (i.e., the remaining group members are
re-grouped). Processing cycles until a stop state is
reached, such as a code book having predetermined
size.

[0043] FIG. 4 shows an initial grouping for the input
stream "A A A B B A A C A B A B B A B". An initial parsing
of the input shows probabilities of occurrence of A =
8/15, B = 6/15, and C = 1/15. This initial trivial grouping
can be created based on different criteria, the simplest
being having a first-level node for every character in the
alphabet. However, if the input alphabet is large, the trivial
grouping may be limited to some subset of symbols
having highest probability, where the remaining symbols
are combined into an X grouping. FIG. 4 illustrates this
technique by starting with only two initial groups, group
A 400 having probability 8/15, and group X0 402 having
probability 7/15, where X0 represents all remaining low
probability symbols in the alphabet, e.g., B and C.
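The initial parsing can be reproduced with a short Python sketch. It counts the symbols of the exemplary input stream (with the spaces removed) and then forms the two initial groups; using exact fractions and folding every symbol except the most probable one into X0 are choices made for this illustration.

```python
from collections import Counter
from fractions import Fraction

stream = "AAABBAACABABBAB"  # exemplary input from the text, spaces removed
counts = Counter(stream)
total = len(stream)
probabilities = {s: Fraction(c, total) for s, c in counts.items()}

# Keep only the most probable symbol as its own group; fold the rest into X0.
top_symbol, top_count = counts.most_common(1)[0]
groups = {
    top_symbol: Fraction(top_count, total),
    "X0": Fraction(total - top_count, total),
}
print(counts, groups)
```

The counts come out as 8, 6, and 1 occurrences of A, B, and C among 15 symbols, so group A has probability 8/15 and group X0 has probability 7/15.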
[0044] After preparing an initial trivial grouping, the
leaf node having highest probability is selected for
extension (see also FIG. 3 discussion regarding processing
sequence). Hence, as shown in FIG. 5, group A 400
is tentatively expanded by each character in the alphabet
(or one may limit the expansion to some subset
thereof as described for creating the initial grouping).
Probabilities are then recomputed with respect to the input
stream "A A A B B A A C A B A B B A B" to determine
values for the tentative extensions A 406, B 408, and C
410. The result is nine parsing groups, where "A A"
appears 2/9, "A B" appears 4/9, and "A C" appears 0/9.
Therefore, the most probable extension "A B" is retained
and the other extensions collapsed into X1 = A,C. Note
that although this discussion repeatedly recalculates all
probabilities, a more efficient approach is to retain
probabilities and symbol associations for each node within
the node, and to compute information only as necessary.
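The nine parsing groups can be checked with a small greedy parser. The Python sketch below assumes a longest-match parse of the exemplary stream against the tentative entries "A A", "A B", and "A C", with B and C on their own still belonging to group X0; the function name `parse` and the dictionary layout are illustrative assumptions.

```python
from collections import Counter

def parse(stream, entries):
    """Greedy longest-match parse; entries maps sequence -> group label."""
    longest = max(len(sequence) for sequence in entries)
    labels, position = [], 0
    while position < len(stream):
        for size in range(min(longest, len(stream) - position), 0, -1):
            candidate = stream[position:position + size]
            if candidate in entries:
                labels.append(entries[candidate])
                position += size
                break
        else:
            raise ValueError(f"unparsable input at position {position}")
    return labels

stream = "AAABBAACABABBAB"
# Group A tentatively extended by each symbol; B and C remain in group X0.
entries = {"AA": "A A", "AB": "A B", "AC": "A C", "B": "X0", "C": "X0"}
labels = parse(stream, entries)
tally = Counter(labels)
print(len(labels), tally)
```

Under this assumed parse the stream splits into nine groups: "A A" twice, "A B" four times, "A C" never, and X0 three times.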
[0045] FIG. 6 shows the collapse into X1 for FIG. 5.
Processing repeats with identification of the node having
highest probability, e.g., node B 408 at probability
4/9.

[0046] As shown in FIG. 7, this node 408 is tentatively
extended with symbols A 414, B 416, C 418, and as
discussed above, the tentative grouping with highest
probability is retained. After recalculating probabilities, the
result is eight parsing groups in which the symbol
sequence "A B A" 414 appears once, "A B B" 416 appears
once, and "A B C" 418 does not appear at all. Since
tentative extensions A 414 and B 416 have the same
probability of occurrence, a rule needs to be defined to
choose which symbol to retain. For this discussion,
whenever there are equal probabilities, the highest row
node (e.g., the left-most child node in a top-down tree)
is retained. Similarly, when there is a conflict between
tree rows, the left-most row's node (e.g., the node closest
to the root of a top-down tree) is retained.
[0047] Note that the above described parsing of the 
exemplary input does not account for the trailing two 
symbols "A B M of the input. As illustrated in FIG. 7, there 
is no leaf corresponding to "A B w as that configuration 
was expanded into "A B A", "A B B", and M A B C". To 
compensate, code book entries can be created to ac- 
count for such end of input sequences, or the input hav- 
ing no entry can be escaped with a special character 
and inserted in the encoded output stream. For exam- 
ple, a special symbol can be used to indicate end of in- 
put, therefore implying how to handle the trailing char- 
acters on decoding. 
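The escape approach can be sketched as follows. This is a minimal illustration, assuming a greedy longest-match parse and a hypothetical escape marker "<ESC>" followed by the raw symbol; the leaf set spells out the FIG. 7 groupings with X1 and X0 expanded into their constituent symbols.

```python
ESCAPE = "<ESC>"  # hypothetical marker; the text does not name one

def parse_with_escape(stream, leaves):
    """Split `stream` into code book leaves, escaping unmatched symbols.

    When no leaf matches at the current position, one raw symbol is
    emitted as an (ESCAPE, symbol) pair and parsing resumes after it.
    """
    max_len = max(len(leaf) for leaf in leaves)
    out, pos = [], 0
    while pos < len(stream):
        for size in range(min(max_len, len(stream) - pos), 0, -1):
            chunk = stream[pos:pos + size]
            if chunk in leaves:
                out.append(chunk)
                pos += size
                break
        else:
            out.append((ESCAPE, stream[pos]))
            pos += 1
    return out

# FIG. 7 leaves: A B A, A B B, A B C, A X1 (= AA, AC) and X0 (= B, C).
leaves = {"ABA", "ABB", "ABC", "AA", "AC", "B", "C"}
parsed = parse_with_escape("AAABBAACABABBAB", leaves)
print(parsed)
```

In this particular sketch only the trailing "A" needs escaping, because the final "B" still matches a leaf on its own; a different escape policy (for example, escaping the whole remaining tail) would be equally consistent with the text.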

[0048] Therefore, as shown in FIG. 8, node A 414 is
retained and nodes B 416 and C 418 are combined into
node X2=B,C 420, having a combined probability of 1/8
+ 0/8. The next step is to expand the node currently
having the highest probability with respect to the input
stream. As shown, nodes X1=A,C 412 and X0=B,C 402
have the same probability of occurrence (3/8). As
discussed above, the highest node in the tree (X0 402) is
extended. (Although it is only necessary to be consistent,
it is preferable to expand higher level nodes since
this may increase coding efficiency by increasing the
number of long code words.)

[0049] However, X0 402 is a combined node, so it
must be split instead of extended. FIG. 9 illustrates the
result of splitting node X0 into its constituent symbols B
422 and C 424. Recalculating probabilities indicates that
symbol sequence "A B A" appears 1/8, "A B X2" appears
1/8, "A X1" appears 3/8, "B" 422 appears 2/8, and
"C" appears 1/8. Since this is a split operation, the split
node having the highest probability, e.g., node B 422, is
retained, and the remaining node(s) are re-combined back
into X0=C 424.

[0050] FIG. 10 shows the result of retaining high-probability
node B 422. Note that grouping X0 now only represents
a single symbol "C". After revising probabilities,
the node having the highest probability must be identified
and split or extended. As shown, symbol sequence "A
B A" appears 1/8, "A B X2" appears 1/8, "A X1" appears
3/8, "B" appears 2/8, and "X0" appears 1/8. Therefore
node X1 412, as a combined node, must be split.
[0051] Splitting proceeds as discussed above, and
processing the code book cycles as illustrated in FIG.
3, with highest probability nodes being extended or split
until a stop state is reached (e.g., the code book reaches
a maximum size). Once the code book has reached a
stop state, it is available for encoding data to transmit
over a communication channel. Note that for the FIG. 10
configuration, the average bits per input symbol,
assuming fractional bits under "ideal" scalar Huffman
encoding, is approximately 0.8 bits/symbol (varies
depending on how the trailing input "A B" is handled). This
represents a significant (about 10%) savings over
previous lossless compression techniques.
[0052] FIG. 11 shows a transmission model for transmitting
audio data over a channel 460. It is presumed
that the channel 460 is of limited bandwidth, and therefore
some compression of source data 450 is required
before the data can be reliably sent. Note that although
this discussion focuses on transmission of audio data,
the invention applies to transfer of other data, such as
audiovisual information having embedded audio data
(e.g., multiplexed within an MPEG datastream), or other
data sources having compressible data patterns (e.g.,
coherent data).

[0053] As illustrated, source data 450 is input to a
time/frequency transform encoder 452 such as a filter bank
or discrete-cosine type transform. Transform encoder
452 is designed so as to convert a continuous or sampled
time-domain input, such as an audio data source,
into multiple frequency bands of predetermined (although
perhaps differing) bandwidth. These bands can
then be analyzed with respect to a human auditory
perception model 454 (for example, a psychoacoustic model)
in order to determine components of the signal that
may be safely reduced without audible impact. For example,
it is well known that certain frequencies are
inaudible when certain other sounds or frequencies are
present in the input signal (simultaneous masking).
Consequently, such inaudible signals can be safely
removed from the input signal. Use of human auditory
models is well known, e.g., the MPEG 1, 2 and 4
standards. (Note that such models may be combined into a
quantization 456 operation.)

[0054] After performing the time/frequency transformation,
frequency coefficients within each range are
quantized 456 to convert each coefficient (amplitude
level) to a value taken from a finite set of possible values,
where each value has a size based on the bits
allocated to representing the frequency range. The
quantizer may be a conventional uniform or non-uniform
quantizer, such as a midriser or midtreader quantizer
with (or without) memory. The general quantization goal
is identifying an optimum bit allocation for representing
the input signal data, i.e., to distribute usage of available
encoding bits to ensure encoding the (acoustically)
significant portions of the source data. Various quantization
methods, such as quantization step size prediction to
meet a desired bit rate (assuming constant bit rate), can
be used. After the source 450 has been quantized 456,
the resultant data is then entropy encoded 458 according
to the code book of FIG. 3.
[0055] FIG. 12 shows one method for implementing 
the entropy encoder 458 of FIG. 11 through application 
of the code book of FIG. 3 to the quantized data. The 
code book for variable-to-variable encoding can be used
to encode other types of data. As illustrated, the quantized
data is received 480 as input to the entropy encoder
458 of FIG. 11. It is understood that the input is in
some form of discrete signals or data packets, and that 
for simplicity of discussion, all input is simply assumed 
to be a long series of discrete symbols. The received 
input 480 is scanned 482 in order to locate 484 a corre- 
sponding code book key in the code book of FIG. 3. 
Such scanning corresponds to a data look-up, and de- 
pending on the data structure used to implement the 
code book, the exact method of look-up will vary. 
[0056] There are various techniques available for
storing and manipulating the code book. One structure
for a code book is traversal and storage of an N-ary
(e.g., binary, ternary, etc.) tree, where symbol groupings
guide a traversal of the tree structure. The path to a leaf
node of the tree represents the end of a recognized symbol
sequence, where an entropy-type code is associated
with the sequence. (Note that the code book may be
implemented as a table, where a table entry contains the
entire input sequence, e.g., the path to the node.) Nodes
can be coded in software as a structure, class definition,
or other structure allowing storage of a symbol or symbols
associated with the node, and association of a
corresponding entropy-type code 486.
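A tree-structured code book of this kind might be sketched as below. The class name Node, the helper names, and the code words are all hypothetical; the sketch assumes that only leaf nodes carry a code, i.e., that no stored symbol sequence is a prefix of another.

```python
class Node:
    """One code book tree node: children keyed by symbol, code set on leaves."""

    def __init__(self):
        self.children = {}
        self.code = None

def build_tree(code_book):
    """Build the tree from a mapping of symbol sequence -> code word."""
    root = Node()
    for sequence, code in code_book.items():
        node = root
        for symbol in sequence:
            node = node.children.setdefault(symbol, Node())
        node.code = code
    return root

def encode(stream, root):
    """Walk the tree symbol by symbol, emitting a code at each leaf."""
    codes, node = [], root
    for symbol in stream:
        node = node.children[symbol]  # KeyError if sequence not in code book
        if node.code is not None:
            codes.append(node.code)
            node = root
    if node is not root:
        raise ValueError("input ends inside an unfinished symbol sequence")
    return codes

# Hypothetical prefix-free code words, shorter for more probable groupings.
code_book = {"AB": "10", "AA": "0", "B": "110", "AC": "1110", "C": "1111"}
tree = build_tree(code_book)
codes = encode("AAABBAACABABBAB", tree)
print("".join(codes))
```

Because each input symbol moves exactly one level down the tree, look-up cost is proportional to the length of the matched sequence rather than to the size of the code book.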
[0057] Alternatively, the code book may be structured 
as a table having each string of input symbols sorted by
probability of occurrence, with highly probable input at 
the top of the table. For large tables, the table can be 
sorted according to the first symbol, i.e., all symbol se- 
ries beginning with "A" are grouped together, followed 
by series starting with "B", etc. With this arrangement, 
all entries within the grouping are sorted according to 
their probabilities of occurrence. The position of the be- 
ginning of each section is marked/tracked so that a 
hash-type function (e.g., a look-up based on the first 
symbol) can be used to locate the correct portion of the 
code book table. In this look-up table approach to storing
the code book, once the first symbol is hashed, then the 
corresponding table section is exhaustively searched 
until a matching entry is located. The code 484 associ- 
ated with the matching entry is then output 486 as the 
encoded substitute. 
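The sectioned table look-up can be illustrated with a small sketch. A Python dict keyed on the first symbol stands in for the hash-type function; the probabilities and code words are hypothetical values chosen for the running example, and the function names are not from the source.

```python
def build_sections(entries):
    """Group (sequence, probability, code) rows by first symbol.

    Within each section, rows are sorted by descending probability so
    that likely sequences are tested first.
    """
    sections = {}
    for sequence, probability, code in entries:
        sections.setdefault(sequence[0], []).append((sequence, probability, code))
    for rows in sections.values():
        rows.sort(key=lambda row: row[1], reverse=True)
    return sections

def lookup(stream, pos, sections):
    """Search the section for stream[pos] until an entry matches at pos."""
    for sequence, _, code in sections.get(stream[pos], []):
        if stream.startswith(sequence, pos):
            return sequence, code
    raise KeyError(f"no code book entry matches at position {pos}")

def encode(stream, sections):
    """Encode the stream; each match's size gives the next position."""
    codes, pos = [], 0
    while pos < len(stream):
        sequence, code = lookup(stream, pos, sections)
        codes.append(code)
        pos += len(sequence)
    return codes

# Hypothetical table: (symbol sequence, probability, code word).
entries = [
    ("AA", 2 / 9, "0"), ("AB", 4 / 9, "10"), ("AC", 0.0, "1110"),
    ("B", 2 / 9, "110"), ("C", 1 / 9, "1111"),
]
sections = build_sections(entries)
codes = encode("AAABBAACABABBAB", sections)
print(codes)
```

Sorting each section by probability does not change which entry matches when no stored sequence is a prefix of another; it only shortens the expected length of the exhaustive search.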

[0058] Continuing now with FIG. 11, once the output
486 is known, this output is transmitted over the
communication channel 460. The receiving end 470 then
implements a reverse-encoding process, i.e., a series of
steps to undo the encoding of the source data 450. That
is, the encoded data 486 is received as input to an entropy
decoder 462 which performs a reverse code book
look-up to convert the encoded output 486 back into the
original input symbol series 480 (FIG. 12). The recovered
input data 480 is then processed by a de-quantizer
464 and time/frequency transform decoder 466 to reverse
the original coding operations, resulting in
reconstructed data 468 that is similar to the original source
data 450. It should be noted that the reconstructed data
468 only approximates the original source data 450
when, as is presumed herein, a lossy system is employed.
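The reverse code book look-up performed by the entropy decoder can be sketched as follows. The code words are hypothetical and assumed to be prefix-free, so a code word is recognized as soon as its last bit arrives; the function name decode is illustrative.

```python
def decode(bits, code_book):
    """Invert the code book and turn a bit string back into symbols.

    Bits are accumulated until they form a known code word; the
    corresponding variable size symbol series is then output.
    """
    reverse = {code: sequence for sequence, code in code_book.items()}
    symbols, word = [], ""
    for bit in bits:
        word += bit
        if word in reverse:
            symbols.append(reverse[word])
            word = ""
    if word:
        raise ValueError("bit stream ends inside a code word")
    return "".join(symbols)

# Hypothetical prefix-free code book shared by encoder and decoder.
code_book = {"AA": "0", "AB": "10", "B": "110", "AC": "1110", "C": "1111"}
bits = "0" "10" "110" "0" "1111" "10" "10" "110" "10"
recovered = decode(bits, code_book)
print(recovered)
```

The entropy decoding stage itself is lossless: in this sketch it returns exactly the symbol series that was encoded. Any loss in the overall system arises in the quantization stage, not here.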

[0059] Having described and illustrated the principles
of my invention with reference to an illustrated embodiment,
it will be recognized that the illustrated embodiment
can be modified in arrangement and detail without
departing from such principles.



Claims

1. A method of encoding a sequence of digital audio
data symbols (450) for transmission over a communications
channel (460), the method comprising
identifying a first variable size grouping of audio data
symbols within the sequence of digital audio data
symbols, wherein the method is characterized by:

coding (486) the first variable size grouping of
symbols with a coding arrangement which produces
as output an entropy code word corresponding
to the first grouping of symbols,
wherein the coding utilizes a pre-constructed
code book that associates variable size groupings
of audio data symbols with corresponding
entropy code words for variable-to-variable
compression; and

repeatedly identifying and coding (486) subsequent
variable size groupings of symbols such
that at least two identified groupings of symbols
within the sequence of digital audio data symbols
have differing sizes.

2. The method of claim 1 in which the code book is
organized into sections according to a first symbol
of each grouping in the code book, where each section
is further sorted by probability of occurrence of
each entry within each section, and wherein identifying
a variable size grouping comprises:

identifying a section by a first symbol of the variable
size grouping; and
matching the variable size grouping against
each section entry until a match is found.

3. The method of claim 2 wherein the match has a
size, such size indicating a new position in the sequence
of digital audio data symbols for identifying
a second variable size grouping of symbols.

4. The method of claim 3 wherein the first and the second
variable size grouping have differing sizes.

5. The method of claim 1, wherein the code book is
organized as a table having sections, such sections
defined by a first symbol of each variable size
grouping of symbols, and

wherein table entries within a section are sorted according
to probability of occurrence of each variable
size grouping of symbols within the section.

6. The method of any of claims 1 to 5 wherein an input
channel is in communication with a disk storage, the
method further comprising the step of reading the
sequence of digital data symbols from the disk storage,
so as to allow identification and coding of the
variable size groupings of symbols.

7. The method of any of claims 1 to 5 further comprising:

receiving the sequence of digital audio data
symbols from an input channel; and
after the coding (486) of a variable size grouping
of symbols, transmitting the entropy code
word corresponding to the grouping of symbols
over the communications channel (460).

8. The method of claim 7 wherein the sequence of dig- 
ital audio data symbols is received in real-time, and 
the transmitting is performed in real-time. 

9. The method of claim 7 further comprising receiving
a connection request from a client over a client network
connection, wherein the transmitting is performed
over the client network connection.

10. The method of claim 9 wherein the sequence of digital
audio data symbols is received and stored in
non-volatile storage, and the transmitting is delayed
until receipt of the connection request from the client.

11. A method for decoding a compressed audio data
stream from a communications channel (460), comprising:

receiving an entropy code word;

looking up the entropy code word in a pre-constructed
code book containing a correspondence
between entropy code words and variable
size series of symbols for variable-to-variable
decompression; and

outputting a variable size series of audio data
symbols corresponding to the code word in the
code book.

12. The method of claim 11 wherein the step of looking
up the entropy code word includes hashing the entropy
code word to obtain an index to an entry in a
hash table.

13. The method of claim 12 wherein the code book is
organized into sections according to a first symbol
of each variable size series of symbols stored within
the code book.

14. A computer readable medium having stored therein
computer-executable instructions for causing a
computer to perform the method of any of claims 1
to 13.

15. A system for transmitting a compressed audiovisual
data stream from a server network service to a client
network service over a network link (460), such
compressed audiovisual data stream formed by replacing
a variable size sequence of audiovisual data
input symbols with an output entropy code, the
system comprising:

an input buffer for storing a series of uncompressed
audiovisual data symbols to compress
and transmit to the client;

an output memory for storing an entropy code
word representing a compressed version of the
series of uncompressed symbols in the input
buffer;

wherein the system is characterized by:

a code book memory for storing a pre-existing
code book containing an entropy code word for
a variable size series of symbols, wherein the
code book associates variable size series of
symbols with corresponding entropy code
words for variable-to-variable compression;

a searching arrangement for looking up an entropy
code word for a particular series of symbols
in the code book; and

an encoding arrangement having an input
channel in communication with the input buffer
and an output in communication with the output
memory;

wherein the encoding arrangement applies the
searching arrangement to look up the entropy code
word for the series of uncompressed symbols for
storage in the output memory.

16. The system of claim 15 further comprising transmission
means for transmitting the contents of the output
memory to the client over the network link (460).

17. The system of claim 16 wherein a streaming net-
working protocol is utilized to communicate over the 
network link (460). 

18. A system for decoding compressed audiovisual da- 
ta received from a network link (460), the system 
comprising: 

an input memory for storing a code word; 
an output buffer for storing a series of uncom- 
pressed audiovisual data symbols correspond- 
ing to the code word in the input memory; 

wherein the system is characterized by: 

a code book memory for storing a pre-existing 
code book containing a variable size series of 
symbols for an entropy code word, wherein the 
code book associates entropy code words with 
corresponding variable size series of symbols 
for variable-to-variable decompression; 

a searching arrangement for looking up an en- 
tropy code word for a particular series of sym- 
bols in the code book; and 

a decoding arrangement having an input chan- 
nel in communication with the input memory 
and an output in communication with the output 
buffer, 

wherein the decoding arrangement applies the 
searching arrangement to look up the series of un- 
compressed symbols corresponding to the entropy 
code word, and store such series in the output buff- 
er. 

19. The system of claim 18 wherein a streaming net- 
working protocol is utilized to communicate over the 
network link (460). 

20. The system of claim 18 in which the decoding ar- 
rangement is implemented by an application pro- 
gram that also implements the Hyper-Text Markup 
Language protocol. 




Revindications 

1. Procede de codage d'une sequence de symboles 
de donnees audio numeriques (450) pour transmis- 
sion sur une voie de communication (460), le pro- 
cede comprenant d'identifier un premier groupe- 
ment de taille variable de symboles de donnees 
audio k I'interieur de la sequence de symboles de 
donnees audio numeriques, dans lequel le procede 
est caracterlse par les stapes consistant k : 

coder (486) le premier groupement de taille va- 
riable de symboles avec un arrangement de co- 



dage qui produit en sortie un mot de code en- 
tropique correspondent au premier groupe- 
ment de symboles, dans lequel le codage utili- 
se un livre de codes pr£construit qui associe 
5 des groupements de taille variable de symbo- 

les de donnees audio k des mots de code en- 
tropique pour une compression variable k 
variable ; et 

identifier et coder de manifcre repetitive (486) 
10 des groupements suivants de taille variable de 

symboles, de telle manidre qu'au moins deux 
groupements identifies de symboles dans la 
sequence de symboles de donnees audio nu- 
meriques aient des tailles diff&rentes. 

15 

2. Precede selon la revendication 1 , dans lequel le li- 
vre de codes est organise en sections selon un pre- 
mier symbole de chaque groupement dans le livre 
de codes, dans lequel chaque section est en outre 

20 triee par probability d'occurrence de chaque entree 
k I'interieur de chaque section et dans lequel Iden- 
tification d'un groupement de taille variable com- 
prend les etapes consistant k : 

25 identifier une section par un premier symbole 

du groupement de taille variable ; et 
comparer le groupement de taille variable k 
chaque entree de section jusqu'& ce qu'une 
correspondence sort trouvee. 

30 

3. Precede selon la revendication 2, dans lequel la 
correspondence a une taille, une telle taille indi- 
quant une nouvelle position dans la sequence de 
symboles de donnees audio numeriques pour ideri- 

35 tifier un second groupement de taille variable de 
symboles. 

4. Proced6 selon la revendication 3, dans lequel le 
premier et le second groupements de taille variable 

40 ont des tailles differentes. 

5. Procede selon la revendication 1 , dans lequel le li- 
vre de codes est organise sous la forme d'une table 
ayant des sections, ces sections etant definies par 

45 un premier symbole de chaque groupement de taille 
variable de symboles, et 

dans lequel tes entrees de la table a Pinterieur 
d'une section sont triees selon une probabilite d'oc- 
currence de chaque groupement de taille variable 

so de symboles k I'interieur de la section. 

6. Procede selon Tune quelconque des revendications 
165, dans lequel une voie cPentree est en commu- 
nication avec une memoire k disque, le procede 

55 comprenant en outre P6tape consistant k lire la se- 
quence de symboles de donnees numeriques k par- 
tir de la memoire k disque de maniere k permettre 
{'identification et le codage des groupements de 
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taille variable de symboles. 

7. Proc6d6 selon Tune quelconque des revendications 
165, comprenant en outre les Stapes consistant k : 

recevoir la sequence de symboles de donnees 
audio numeriques a partir d'une voie d'entree ; 
et 

aprfcs le codage (486) d'un groupement de 
taille variable de symboles, transmettre le mot 
de code entropique correspondent au groupe- 
ment de symboles sur la voie de communica- 
tion (460). 

8. Procede selon la revendication 7, dans lequel la se- 
quence de symboles de donnees audio numeriques 
est regue en temps reel et dans lequel la transmis- 
sion est effectuee en temps reel. 

9. Procede selon la revendication 7, comprenant en 
outre de recevoir une demande de connexion de- 
puis un client sur une connexion de reseau client, 
dans lequel la transmission est effectuee sur la con- 
nexion de r6seau client. 

1 0. Proc6d6 selon la revendication 9, dans lequel la se- 
quence de symboles de donnees audio numeriques 
est regue et stockee dans une memoire non volatile 
et dans lequel la transmission est retard6e jusqu'a 
la reception de la demande de connexion depuis le 
client. 

11. Proc6de de d6codage d'un train de donnees audio 
compressees provenant d'une voie de communica- 
tion (460), comprenant les etapes consistant k : 

recevoir un mot de code entropique ; 
rechercher le mot de code entropique dans un 
llvre de codes preconstruit contenant une cor- 
respondence entre des mots de code entropi- 
ques et des series de taille variable de symbo- 
les pour une decompression variable k 
variable ; et 

emettre une serie de taille variable de symboles 
de donnees audio correspondent au mot de co- 
de dans le livre de codes. 

12. Proc6d6 selon la revendication 11, dans lequel 
I'etape consistant k rechercher le mot de code en- 
tropique comprend de hacher le mot de code entro- 
pique pour obtenir un index vers une entree dans 
une table de hachage. 

13. Procede selon la revendication 12, dans lequel le 
llvre de codes est organise en sections selon un 
premier symbole de chaque serie de taille variable 
de symboles stock6 dans le livre de codes. 



14. Support exploitable par un ordinateur sur lequel 
sont stoctees des instructions ex£cutab!es par un 
ordinateur pour entraTner qu'un ordinateur execute 
le procede de Tune quelconque des revendications 

5 1613. 

15. System for transmitting a compressed audiovisual
data stream from a server network service to a client
network service over a network link (460), such
compressed audiovisual data stream being formed by
replacing a variable size sequence of audiovisual data
input symbols with an output entropy code, the system
comprising:

an input buffer for storing a series of uncompressed
audiovisual data symbols to be compressed and
transmitted to the client;

an output memory for storing an entropy code word
representing a compressed version of the series of
uncompressed symbols in the input buffer;

wherein the system is characterized by:

a code book memory for storing a pre-existing code book
containing an entropy code word for a variable size
series of symbols, wherein the code book associates
variable size series of symbols with corresponding
entropy code words for variable-to-variable
compression;

a searching arrangement for looking up an entropy code
word for a particular series of symbols in the code
book; and

an encoding arrangement having an input in
communication with the input buffer and an output in
communication with the output memory;

wherein the encoding arrangement applies the searching
arrangement to look up an entropy code word for the
series of uncompressed symbols for storage in the
output memory.
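
The encoding arrangement of claim 15 can be sketched as follows. This is a hedged illustration only: the greedy longest-match search, the sample code book and the names CODE_BOOK and encode are assumptions made for the example, and the claim itself does not prescribe a particular search strategy.

```python
# Hypothetical pre-existing code book: symbol series -> entropy code word.
CODE_BOOK = {
    ("A",): "00",
    ("A", "B"): "01",
    ("B",): "10",
    ("B", "B", "A"): "110",
    ("C",): "111",
}


def encode(input_buffer, code_book):
    """Replace variable size symbol series with entropy code words.

    Assumption: at each position the longest series present in the
    code book is chosen (greedy longest match).
    """
    max_len = max(len(series) for series in code_book)
    output_memory = []
    pos = 0
    while pos < len(input_buffer):
        longest = min(max_len, len(input_buffer) - pos)
        for length in range(longest, 0, -1):
            series = tuple(input_buffer[pos:pos + length])
            if series in code_book:          # the searching arrangement
                output_memory.append(code_book[series])
                pos += length
                break
        else:
            raise ValueError(f"no code book entry at position {pos}")
    return "".join(output_memory)


encoded = encode(list("ABBBAC"), CODE_BOOK)
```

Here the six input symbols are grouped as (A, B), (B, B, A) and (C), so three code words of differing lengths stand for symbol series of differing lengths, which is the variable-to-variable property the claim refers to.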

16. System according to claim 15, further comprising
transmission means for transmitting the contents of the
output memory to the client over the network link
(460).

17. System according to claim 16, wherein a streaming
network protocol is used to communicate over the
network link (460).
18. System for decoding compressed audiovisual data
received from a network link (460), the system
comprising:

an input memory for storing a code word;

an output buffer for storing a series of uncompressed
audiovisual data symbols corresponding to the code word
in the input memory;

wherein the system is characterized by:

a code book memory for storing a pre-existing code book
containing a variable size series of symbols for an
entropy code word, wherein the code book associates
entropy code words with corresponding variable size
series of symbols for variable-to-variable
decompression;

a searching arrangement for looking up a particular
series of symbols for an entropy code word in the code
book; and

a decoding arrangement having an input in communication
with the input memory and an output in communication
with the output buffer;

wherein the decoding arrangement applies the searching
arrangement to look up the series of uncompressed
symbols corresponding to the entropy code word and to
store such series in the output buffer.
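
The decoding arrangement of claim 18 can be illustrated in the same spirit. The sketch below assumes a prefix-free code book held as a Python dict and a bit string as the received data; CODE_BOOK and decode are hypothetical names introduced for this example only.

```python
# Hypothetical pre-existing code book: symbol series -> entropy code word.
CODE_BOOK = {
    ("A",): "00",
    ("A", "B"): "01",
    ("B",): "10",
    ("B", "B", "A"): "110",
    ("C",): "111",
}


def decode(bits, code_book):
    """Expand entropy code words back into variable size symbol series.

    Assumption: the code words are prefix-free, so the first match
    found while accumulating bits is the only possible one.
    """
    decode_table = {word: series for series, word in code_book.items()}
    output_buffer = []
    current = ""                      # plays the role of the input memory
    for bit in bits:
        current += bit
        series = decode_table.get(current)   # the searching arrangement
        if series is not None:
            output_buffer.extend(series)
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code word")
    return output_buffer


decoded = decode("01110111", CODE_BOOK)
```

Each code word found in the input memory yields one complete series of symbols, so a single lookup may place one, two or three symbols in the output buffer.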

19. System according to claim 18, wherein a streaming
network protocol is used to communicate over the
network link (460).

20. System according to claim 18, wherein the decoding
arrangement is implemented by an application program
which also implements the Hypertext Markup Language
protocol.